OntoMiner: Bootstrapping and Populating Ontologies from Domain Specific Web Sites
Abstract
... HTML documents, which are designed primarily for human consumption. The presence of such legacy documents makes embracing the Semantic Web vision difficult [2]. Thus, we need scalable solutions to automatically transform legacy HTML to Semantic Web documents. Recent work describes algorithms that automatically annotate HTML documents with semantic labels [3]. Unfortunately, constructing the domain ontologies that drive these algorithms is human intensive. Bootstrapping and populating large, rich, and up-to-date domain ontologies that organize the most relevant concepts, their relationships, and instances (which correspond to members of concepts) can considerably enhance the scalability of transforming legacy HTML into Semantic Web documents. Our system, OntoMiner, offers automated techniques for creating such ontologies based on a small collection of relevant Web sites. Users can then employ the ontologies to create a rich set of labeled examples that a supervised machine-learning system such as WebKB can use [4].
Similar resources
OntoMiner: automated metadata and instance mining from news websites
RDF/XML has been widely recognised as the standard for annotating online web documents and for transforming the HTML web into the so-called Semantic Web. In order to enable widespread usability of the Semantic Web, there is a need to bootstrap large, rich and up-to-date domain ontologies that organise the most relevant concepts, their relationships and instances. In this paper, we present autom...
Enriching an Academic Knowledge base using Linked Open Data
In this paper we present work done towards populating a domain ontology using a public knowledge base like DBpedia. Using an academic ontology as our target we identify mappings between a subset of its predicates and those in DBpedia and other linked datasets. In the semantic web context, ontology mapping allows linking of independently developed ontologies and inter-operation of heterogeneous ...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Bootstrapping Domain Ontology for Semantic Web Services from Source Web Sites
The vision of Semantic Web services promises a network of interoperable Web services over different sources. A major challenge to the realization of this vision is the lack of automated means of acquiring domain ontologies necessary for marking up the Web services. In this paper, we propose the DeepMiner system which learns domain ontologies from the source Web sites. Given a set of sources in ...
Semantic Turkey: A browser-integrated environment for knowledge acquisition and management
Born four years ago as a Semantic Web extension for the web browser Firefox, Semantic Turkey pushed forward the traditional concept of links&folders-based bookmarking to a new dimension, allowing users to keep track of relevant information from visited web sites and to organize the collected content according to standard or personally defined ontologies. Today, the tool has broken the boundarie...
Publication date: 2003